A Dataset for Joint Noun-Noun Compound Bracketing and Interpretation

Author

  • Murhaf Fares
Abstract

We present a new, sizeable dataset of noun–noun compounds with their syntactic analysis (bracketing) and semantic relations. Derived from several established linguistic resources, such as the Penn Treebank, our dataset enables experimenting with new approaches towards a holistic analysis of noun–noun compounds, such as joint learning of noun–noun compound bracketing and interpretation, as well as integrating compound analysis with other tasks such as syntactic parsing.

Similar resources

Search Engine Statistics Beyond the n-Gram: Application to Noun Compound Bracketing

In order to achieve the long-range goal of semantic interpretation of noun compounds, it is often necessary to first determine their syntactic structure. This paper describes an unsupervised method for noun compound bracketing which extracts statistics from Web search engines using a χ² measure, a new set of surface features, and paraphrases. On a gold standard, the system achieves results of 89....

A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation

The automatic interpretation of noun-noun compounds is an important subproblem within many natural language processing applications and is an area of increasing interest. The problem is difficult, with disagreement regarding the number and nature of the relations, low inter-annotator agreement, and limited annotated data. In this paper, we present a novel taxonomy of relations that integrates p...

Multiword noun compound bracketing using Wikipedia

This research suggests two contributions in relation to the multiword noun compound bracketing problem: first, demonstrate the usefulness of Wikipedia for the task, and second, present a novel bracketing method relying on a word association model. The intent of the association model is to represent combined evidence about the possibly lexical, relational or coordinate nature of links between al...

Linked Open Data and Web Corpus Data for noun compound bracketing

This research provides a comparison of a linked open data resource (DBpedia) and web corpus data resources (Google Web Ngrams and Google Books Ngrams) for noun compound bracketing. Large corpus statistical analysis has often been used for noun compound bracketing, and our goal is to introduce a linked open data (LOD) resource for such a task. We show its particularities and its performance on the...

Scaling Up BioNLP: Application of a Text Annotation Architecture to Noun Compound Bracketing

We describe the use of the Layered Query Language and architecture to acquire statistics for natural language processing applications. We illustrate the system’s use on the problem of noun compound bracketing using MEDLINE.

Publication date: 2016